OCR exploration server

Warning: this site is under development.
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Stress-Testing General Purpose Digital Library Software

Internal identifier: 000907 (Main/Exploration); previous: 000906; next: 000908

Authors: David Bainbridge [New Zealand]; H. Witten [New Zealand]; Stefan Boddie; John Thompson

RBID : ISTEX:3E4C0B8C446E0A55D627BB7DF463C7B6FF1EAB7E

Abstract

DSpace, Fedora, and Greenstone are three widely used open source digital library systems. In this paper we report on scalability tests performed on these tools by ourselves and others. These range from repositories populated with synthetically produced data to real world deployment with content measured in millions of items. A case study is presented that details how one of the systems performed when used to produce fully-searchable newspaper collections containing in excess of 20 GB of raw text (2 billion words, with 60 million unique terms), 50 GB of metadata, and 570 GB of images.

DOI: 10.1007/978-3-642-04346-8_21


The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Stress-Testing General Purpose Digital Library Software</title>
<author>
<name sortKey="Bainbridge, David" sort="Bainbridge, David" uniqKey="Bainbridge D" first="David" last="Bainbridge">David Bainbridge</name>
</author>
<author>
<name sortKey="Witten, H" sort="Witten, H" uniqKey="Witten H" first="H." last="Witten">H. Witten</name>
</author>
<author>
<name sortKey="Boddie, Stefan" sort="Boddie, Stefan" uniqKey="Boddie S" first="Stefan" last="Boddie">Stefan Boddie</name>
</author>
<author>
<name sortKey="Thompson, John" sort="Thompson, John" uniqKey="Thompson J" first="John" last="Thompson">John Thompson</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:3E4C0B8C446E0A55D627BB7DF463C7B6FF1EAB7E</idno>
<date when="2009" year="2009">2009</date>
<idno type="doi">10.1007/978-3-642-04346-8_21</idno>
<idno type="url">https://api.istex.fr/document/3E4C0B8C446E0A55D627BB7DF463C7B6FF1EAB7E/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000128</idno>
<idno type="wicri:Area/Istex/Curation">000126</idno>
<idno type="wicri:Area/Istex/Checkpoint">000429</idno>
<idno type="wicri:doubleKey">0302-9743:2009:Bainbridge D:stress:testing:general</idno>
<idno type="wicri:Area/Main/Merge">000915</idno>
<idno type="wicri:Area/Main/Curation">000907</idno>
<idno type="wicri:Area/Main/Exploration">000907</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">Stress-Testing General Purpose Digital Library Software</title>
<author>
<name sortKey="Bainbridge, David" sort="Bainbridge, David" uniqKey="Bainbridge D" first="David" last="Bainbridge">David Bainbridge</name>
<affiliation wicri:level="1">
<country xml:lang="fr">Nouvelle-Zélande</country>
<wicri:regionArea>Department of Computer Science, University of Waikato, Hamilton</wicri:regionArea>
<wicri:noRegion>Hamilton</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Nouvelle-Zélande</country>
</affiliation>
</author>
<author>
<name sortKey="Witten, H" sort="Witten, H" uniqKey="Witten H" first="H." last="Witten">H. Witten</name>
<affiliation wicri:level="1">
<country xml:lang="fr">Nouvelle-Zélande</country>
<wicri:regionArea>Department of Computer Science, University of Waikato, Hamilton</wicri:regionArea>
<wicri:noRegion>Hamilton</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Nouvelle-Zélande</country>
</affiliation>
</author>
<author>
<name sortKey="Boddie, Stefan" sort="Boddie, Stefan" uniqKey="Boddie S" first="Stefan" last="Boddie">Stefan Boddie</name>
<affiliation>
<wicri:noCountry code="subField">NZ</wicri:noCountry>
</affiliation>
<affiliation>
<wicri:noCountry code="no comma">E-mail: stefan@dlconsulting.com</wicri:noCountry>
</affiliation>
</author>
<author>
<name sortKey="Thompson, John" sort="Thompson, John" uniqKey="Thompson J" first="John" last="Thompson">John Thompson</name>
<affiliation>
<wicri:noCountry code="subField">NZ</wicri:noCountry>
</affiliation>
<affiliation>
<wicri:noCountry code="no comma">E-mail: john@dlconsulting.com</wicri:noCountry>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="s">Lecture Notes in Computer Science</title>
<imprint>
<date>2009</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">3E4C0B8C446E0A55D627BB7DF463C7B6FF1EAB7E</idno>
<idno type="DOI">10.1007/978-3-642-04346-8_21</idno>
<idno type="ChapterID">21</idno>
<idno type="ChapterID">Chap21</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass></textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Abstract: DSpace, Fedora, and Greenstone are three widely used open source digital library systems. In this paper we report on scalability tests performed on these tools by ourselves and others. These range from repositories populated with synthetically produced data to real world deployment with content measured in millions of items. A case study is presented that details how one of the systems performed when used to produce fully-searchable newspaper collections containing in excess of 20 GB of raw text (2 billion words, with 60 million unique terms), 50 GB of metadata, and 570 GB of images.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>Nouvelle-Zélande</li>
</country>
</list>
<tree>
<noCountry>
<name sortKey="Boddie, Stefan" sort="Boddie, Stefan" uniqKey="Boddie S" first="Stefan" last="Boddie">Stefan Boddie</name>
<name sortKey="Thompson, John" sort="Thompson, John" uniqKey="Thompson J" first="John" last="Thompson">John Thompson</name>
</noCountry>
<country name="Nouvelle-Zélande">
<noRegion>
<name sortKey="Bainbridge, David" sort="Bainbridge, David" uniqKey="Bainbridge D" first="David" last="Bainbridge">David Bainbridge</name>
</noRegion>
<name sortKey="Bainbridge, David" sort="Bainbridge, David" uniqKey="Bainbridge D" first="David" last="Bainbridge">David Bainbridge</name>
<name sortKey="Witten, H" sort="Witten, H" uniqKey="Witten H" first="H." last="Witten">H. Witten</name>
<name sortKey="Witten, H" sort="Witten, H" uniqKey="Witten H" first="H." last="Witten">H. Witten</name>
</country>
</tree>
</affiliations>
</record>
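
As an illustration of how a record of this shape might be read programmatically, here is a minimal Python sketch that pulls the title and author sort keys out of the `titleStmt` element. It works on an abridged copy of the record rather than the full page: the `wicri:` attributes are left out because the page does not declare that namespace prefix, and a standard XML parser would reject it as written. The function name `title_and_authors` and the `SAMPLE` constant are hypothetical and not part of Dilib.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the record above; the wicri: attributes are omitted
# because the page never declares that namespace prefix.
SAMPLE = """<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Stress-Testing General Purpose Digital Library Software</title>
<author>
<name sortKey="Bainbridge, David" first="David" last="Bainbridge">David Bainbridge</name>
</author>
<author>
<name sortKey="Witten, H" first="H." last="Witten">H. Witten</name>
</author>
<author>
<name sortKey="Boddie, Stefan" first="Stefan" last="Boddie">Stefan Boddie</name>
</author>
<author>
<name sortKey="Thompson, John" first="John" last="Thompson">John Thompson</name>
</author>
</titleStmt>
</fileDesc>
</teiHeader>
</TEI>
</record>"""


def title_and_authors(xml_text):
    """Return (title, [author sort keys]) from the record's titleStmt."""
    root = ET.fromstring(xml_text)
    stmt = root.find(".//titleStmt")
    title = stmt.findtext("title")
    authors = [name.get("sortKey") for name in stmt.findall("author/name")]
    return title, authors


if __name__ == "__main__":
    title, authors = title_and_authors(SAMPLE)
    print(title)
    for author in authors:
        print(" -", author)
```

To parse the full record unchanged, one option would be to bind the `wicri` prefix to a placeholder namespace URI on the root element before parsing; the URI to use is not given on this page.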

To work with this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000907 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000907 | SxmlIndent | more

To link to this page from within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     ISTEX:3E4C0B8C446E0A55D627BB7DF463C7B6FF1EAB7E
   |texte=   Stress-Testing General Purpose Digital Library Software
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024